Systems and methods for determining semiotic similarity between queries and database entries

ABSTRACT

A semiotic analysis system, computer readable medium and method are described. The system has a computer and a searchable database in communication with the computer. The database has a piece of media, a portion of which is associated with a semiotic describer. The computer readable medium has pieces of media, at least one of which has a portion identified by a semiotic describer. The method includes providing a database of pieces of media, portions of which are associated with a semiotic describer. A query having a semiotic signifier is provided, and the query is used to search for and retrieve possibly relevant media corresponding to the query.

CROSS CLAIM TO RELATED APPLICATION

Priority is hereby claimed to U.S. Provisional Patent Application No. 60/180,674, filed on Feb. 7, 2001, which is hereby incorporated by this reference.

BACKGROUND OF THE INVENTION

1. Field Of The Invention

The present invention relates generally to devices and methods of analyzing, and to devices and methods of categorizing data.

2. Discussion of Related Art

In the prior art, there are devices and methods that categorize and analyze text data. One example is disclosed in U.S. Pat. No. 6,006,221 (the “'221 Patent”). The '221 Patent discloses a document retrieval system where a user can enter a query and retrieve documents from a database. Each document in the database is subjected to a set of processing steps to generate a language-independent conceptual representation of the subject content of the document. A query is also subjected to a set of processing steps to generate a language-independent conceptual representation of the subject content of the query. The documents and queries can also be subjected to additional analysis to provide additional term-based representations, such as the extraction of information-rich terms and phrases. Documents are matched to queries based on the conceptual-level contents of the document and query, and optionally, on the basis of the term-based representation. The query's representation is then compared to each document's representation to generate a measure of relevance of the document to the query.

The prior art systems sometimes suffer from an inability to properly identify some relevant documents. The prior art systems sometimes suffer from an inability to properly measure the document's relevance to the query.

SUMMARY OF THE INVENTION

The invention described and claimed herein represents an improvement over prior art systems in that in many situations, the invention is better able to properly identify relevant documents, and is better able to properly measure the document's relevance to the query. Moreover, the invention provides a unified architecture to integrate multiple media and heterogeneous databases so that they can be analyzed and traversed with a single query. The invention structures data to permit complex analytic operations and to equip automated agents to perform these operations in interaction with signals provided by a user, a changing data environment, or other automated agents. The invention also is able to preserve semiotic information not captured by prior art systems.

The invention includes a system that has a searchable database. The database has pieces of media. At least some of the pieces of media have associated therewith a describer. For example, a describer may indicate a relationship between portions of the piece of media, or a describer may indicate how the portion is used or interpreted, or its effect.

The invention also includes a computer readable medium having pieces of media thereon. At least one of the pieces of media has a portion identified by a describer. The describer indicates a semiotic property of the portion.

The invention also includes a method wherein a database of pieces of media is provided with describers corresponding to portions of a piece of media. A query having semiotic signifiers is provided. The query is used to search for and retrieve possibly relevant media corresponding to the query.

BRIEF DESCRIPTION OF THE DRAWINGS

For a fuller understanding of the nature and objects of the invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a schematic representation of a system according to the present invention that includes a computer in communication with a database; and

FIG. 2 represents a ten dimensional semiotic division of the global information space according to the present invention.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 shows a system according to the present invention. The system includes a computer 10 that has software thereon. When the software is running, the computer 10 will accept a query from a user, and then search a database 20 for pieces of media that may be relevant to the query.

The system also includes a database 20 that is in communication with the computer 10. The database 20 has pieces of media. For example, a piece of media may be text of a document, or a recording of an audio or video presentation. The database 20 may exist on a computer readable medium and may have the pieces of media thereon. Examples of computer readable media are a floppy disk, a compact disc, a random access memory and a read only memory. At least one of the pieces of media has a portion identified by at least one describer. The describer indicates a semiotic property of the portion.

The describers provide information about portions of the piece of media. For example, if the piece of media includes text, it may be desirable to associate one or more describers with a portion of the text in order to provide additional information about the portion.

When it is desired to search the database 20 for relevant media, a query is developed, and certain portions of the query may be associated with one or more signifiers. For example, the query may include words, some of which are associated with signifiers, and the database 20 described above may be searched for the words, and if the words are found in the database 20, a determination is made as to whether describers matching the signifiers are present. The query is entered into the computer 10, which then searches the database 20 for pieces of media that may be relevant to the query. For example, if words and signifiers in the query match words and describers in the database 20, the corresponding piece of media is identified as possibly being particularly relevant to the query.

A portion of a piece of media may have many corresponding describers. A portion of a query may also have many corresponding signifiers. Pieces of media that are highly related to a query will have more matches between descriptors and signifiers than pieces of media that are not highly related to the query. For example, consider a portion of a query that has five signifiers associated with it, a first document that has a portion of text with four describers associated with it, and a second document that has the same portion of text with seven describers associated with it. Assume the first document's describers include four of the five signifiers, but the second document's describers include only two of the five signifiers. When the database 20 is searched, the first document will be identified with the query to a greater degree than will the second document because a greater number of the first document's describers match the query's signifiers. The user may then be provided with a prioritized list of documents that may be related to the query. For example, the first document may be listed on the prioritized list before the second document.
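The counting in the example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the labels s1 through s5 and d3 through d7 are hypothetical placeholders, since the text names no particular signifiers or describers, and a plain match count is assumed as the score.

```python
def match_count(signifiers: set[str], describers: set[str]) -> int:
    """Number of query signifiers that also appear among a portion's describers."""
    return len(signifiers & describers)


# Hypothetical labels: five signifiers attached to a portion of the query.
query_signifiers = {"s1", "s2", "s3", "s4", "s5"}

# Describers attached to the same portion of text in each document.
documents = {
    "first": {"s1", "s2", "s3", "s4"},                      # four describers, four match
    "second": {"s1", "s2", "d3", "d4", "d5", "d6", "d7"},  # seven describers, two match
}

# Prioritized list: documents with more matching describers come first.
ranked = sorted(
    documents,
    key=lambda name: match_count(query_signifiers, documents[name]),
    reverse=True,
)
print(ranked)  # ['first', 'second']
```

Note that the second document carries more describers overall yet ranks lower, because only matches against the query's signifiers are counted.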

In some instances, some descriptors may be more indicative of the content of the media than other descriptors, and so the more indicative descriptors may be given greater weight than the less indicative descriptors. Similarly, some signifiers may be more indicative of the content of the query than other signifiers, and so the more indicative signifiers may be given greater weight than the less indicative signifiers. In such a system of weighted descriptors and weighted signifiers, when a match is found between a descriptor and a signifier, the weight attributable to a descriptor or signifier may be used to provide an indication that the corresponding media may be of greater relevance to the query than are other media that have matching but unweighted descriptors and signifiers. For example, the weights of descriptors and signifiers may be mathematically combined, for example added or multiplied together, to provide a total weight of the matched pair of descriptor and signifier. To illustrate this, consider a query that has ten signifiers associated with a particular portion of the query, one of which is weighted by a factor of two to indicate that the weighted signifier is particularly indicative of the query. Consider a portion of a first piece of media that has five descriptors, two of which are weighted by a factor of three to indicate that the weighted descriptors are particularly indicative of the portion. Also consider a second piece of media that has seven descriptors, none of which are weighted, associated with the same portion found in the first piece of media. If the database 20 is searched, assume the query results in the portion of the first piece of media being identified, and results in the portion of the second piece of media being identified. If three of the descriptors in the portion of the first piece of media match three of the signifiers, and the weighted descriptors and signifiers are among those that are matched, then the first piece of media will be identified as being of particular relevance to the query. However, if three of the descriptors in the portion of the second piece of media match three of the signifiers, although the second piece of media is identified, the second piece of media will not be identified as being of particular relevance to the query. In this example, the first piece of media will be determined to be more relevant to the query than the second piece of media. In a prioritized list of media discovered by the search, the first piece of media will be listed as being more relevant than the second piece of media. Note that the search query may be generalized to include complex learning and mining algorithms, as well as agent communication protocols, as described below.
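The weighted example can be worked through as follows. This is a sketch under several assumptions that the text leaves open: weights of a matched pair are multiplied (the text also allows adding), the per-pair weights are summed into a score, the weighted signifier happens to be among the second document's matches, and all labels (q1 to q10, x1, x2, y1 to y4) are hypothetical.

```python
def weighted_score(signifiers: dict[str, float], descriptors: dict[str, float]) -> float:
    """Sum over matched indicators of (signifier weight x descriptor weight)."""
    matched = signifiers.keys() & descriptors.keys()
    return sum(signifiers[name] * descriptors[name] for name in matched)


# Ten signifiers on one portion of the query; one carries a weight of two.
query = {f"q{i}": 1.0 for i in range(1, 11)}
query["q1"] = 2.0

# First piece of media: five descriptors, two weighted by three, three matching.
first = {"q1": 3.0, "q2": 3.0, "q3": 1.0, "x1": 1.0, "x2": 1.0}

# Second piece of media: seven unweighted descriptors, three matching.
second = {"q1": 1.0, "q2": 1.0, "q3": 1.0, "y1": 1.0, "y2": 1.0, "y3": 1.0, "y4": 1.0}

print(weighted_score(query, first))   # 10.0  (2*3 + 1*3 + 1*1)
print(weighted_score(query, second))  # 4.0   (2*1 + 1*1 + 1*1)
```

Both pieces of media have three matches, so an unweighted count would tie them; the weights are what place the first piece of media ahead of the second.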

One means of providing descriptors and signifiers (collectively referred to herein as “indicators”) is to use a multi-dimensional set of indicators that can be used to describe and categorize portions of pieces of media and portions of a query. Then, when a describer or signifier is needed, the describer or signifier may be selected from the indicators. It should now be clear that descriptors are indicators that correspond to media, while signifiers are indicators that correspond to queries. The multi-dimensional set of indicators described herein is an example of a semiotic method according to the present invention that divides and classifies portions of pieces of media or queries.

In the field of semiotics, the sign relation (“S”) is a triadic primitive entailing the object (“O”), the sign (“R” for “representamen”), and some effect the sign may have upon an interpreting mind or quasi-mind (“I” for ‘interpretant’). O-R-I is thus an irreducible triadic relation involving Object-Representamen-Interpretant. Traditionally, the global information space is divided into three dyads. A first traditional dyad addresses the relation R-O, and is known as “semantics”, which deals with the relation of sign to object. A second traditional dyad addresses the relation R-R, and is known as “syntactics”, which deals with the relation of sign to sign. A third traditional dyad addresses the relation R-I, and is known as “pragmatics”, which deals with meaning, or relation of sign to interpretant. This traditional division of the global information space reduces a triadic relation to three dyads, which according to Peirce's reduction theorem, is not possible without a semiotic loss. For more on Peirce's reduction theorem, see A Peircean Reduction Thesis: The Foundations Of Topological Logic, by Robert W. Burch, 1991. Those familiar with Peirce's teachings will know that relations of more than three are reducible to triads, but triads are not reducible to dyads and monads without semiotic loss. A goal of the invention is to reduce, if not eliminate, the semiotic loss associated with the traditional division of the global information space.

Instead of subdividing global information space into three dyads at the outset, the invention treats the triadic relation as continuous with respect to breadth and depth. With respect to breadth, every interpretant may itself be viewed as a sign relation, with its own object and interpretant; the O-R-I relation thus entails a continuum of semiosis, with each sign generating another.

With respect to depth, any breakpoint along the continuum entails many possibilities that are resolvable and refinable into triadic patterns. For purposes of describing the invention, ten breakpoints (also referenced herein as “dimensions”) along an O-R-I continuum are delineated, and each dimension is divisible into three sign elements. FIG. 2 illustrates such a ten dimensional continuum. Each sign element may be divided into sub-elements. From time to time it is helpful to refer to the sign elements as being “valence” representations of the corresponding dimension, and it is helpful to refer to the sub-elements as being “granular” representations of the corresponding sign element. In this manner, a “richness” or “density” can be realized and exploited in order to properly identify a piece of media with a query. It should be noted that any datum may be represented by many sign elements in any given dimension. This ability to describe a datum with more than one sign element is referred to herein as “compossibility”. The compossible aspect of the present invention preserves information that is not preserved by representations of data that allow only one sign element to be associated with a datum.

The dimensions in this example are not arbitrarily delineated. Delineation of dimensions is determined by three categories of thought that are postulated as fundamental. First, the “immediate” conception of a thing in isolation (requiring only one element); second, the “dynamic” conception of the thing in direct relation to some other (requiring two elements); and third, the “final” conception of the thing in mediated relation to some other (requiring three elements). These three categories may be considered to be moments of thought, i.e. first, we conceive the object of our thought in isolation (a “Firstness”), then as a definition vis-a-vis some other object (a “Secondness”), then as mediated by some sign or other betweenness (a “Thirdness”). We shall use the subscripts i, d and f for immediate, dynamic and final, respectively.

For purposes of distinguishing between the dimensions, Roman numerals I through X are used below. No hierarchy is implied by use of the Roman numerals. To give the dimensions greater descriptive force, we use some of Peirce's often difficult terms for the compossible sign elements within them, but it should be borne in mind that the terms for the sign elements are illustrative and do not exhaust their implications. Each sign element may be thought of as a valence of the dimension with which it is associated. The categories of “firstness”, “secondness” and “thirdness” may be applied throughout the invention. Similarly, where examples are given, this should not be taken to mean that the example is a pure one to which no other sign element might apply, but simply that it bears the particular modality under discussion. It should be recognized that the application of one sign element does not exclude others within the same dimension. With this in mind, the dimensions are:

-   I. R-O sub i: the representamen in relation to the immediate object, that is, the object as contained only in the information of the sign. This is, in fact, the only way a computer ‘knows’ the object. On this dimension there are three sign elements: descriptive, denominative, distributive. Examples of sign elements within this dimension are (1) a descriptor of the object, e.g. “blue”; (2) a name of the object that distinguishes it from one or more others, for example, “Blue Boy” (the painting by Gainsborough); and (3) a distributed property, rule or rule-like relation (or “copulant” in Peirce's terminology) of a class of objects, for example, the “-ness” aspect of “blueness”, or the equal sign (“=”) or implication sign (“>”) in their role as universal copulants.

-   II. R-O sub d: the representamen in relation to the dynamic object, that is, the object in its relations with the world. On this dimension we have Peirce's famous icon-index-symbol sign elements. When the relation of the sign element to its dynamic object is iconic, the sign element represents the object by likeness or resemblance, for example, Blue Boy seen as a portrait of the model. When the relation of the sign to its dynamic object is indexical, it represents the object by reference to some collateral information, for example, the pallor of Blue Boy's face seen by a physician as a symptom of nutritional deficiency. When the relation of the sign to its dynamic object is symbolic, the sign represents the object by some convention or arbitrary habit of association, for example, the English word ‘boy’ as a symbol composed of the letters b-o-y.

-   III. R-O sub f: the representamen in relation to the mode of being of the object, that is, the object once we have attached a settled conception to it, i.e. the “final” object, as opposed to the “dynamic” or “immediate” object. In this dimension, the object may be compossibly abstractive, concretive, collective. The object is abstractive insofar as the sign represents it as an isolate, for example, “atom”. This draws on the etymological meaning of abstract, “to pull out” one element to the exclusion of others. The object is concretive insofar as the sign represents it as composed of relations, for example, “molecule”. This draws on the etymological meaning of concrete as “growing together” of one element into another. The object is collective insofar as the sign represents it as an assemblage or grouping, for example, “matter”. This draws on the etymological meaning of a collection as a “gathering together”.

Dimensions along the continuum between representamen and interpretant, R-I, are set using the immediate-dynamic-final moments of analysis. However, here there is a further two step division depending on whether the interpretant is viewed as an occurrent or continuant, that is, as a discrete effect or as partaking of a stream of activity (herein sometimes referred to as a “process”). We shall use I′ for the interpretant as occurrent and I″ for the interpretant as continuant, and again use the subscripts i, d, and f for immediate, dynamic and final.

-   IV. R-I′ sub i: the representamen in relation to its immediate interpretant, that is, the interpretant in its first blush or flash of meaning. In this dimension the interpretant may be, in Peirce's terminology, hypothetical, categorical, relative, according to whether it is a “might possibly be”, an “is”, or an “is-a-sign of” in some context.

-   V. R-I′ sub d: the representamen in relation to its dynamic interpretant, that is, the interpretant as it effects or generates other signs. In this dimension the interpretant may be sympathetic, percussive/shocking, usual/habitual.

-   VI. R-I′ sub f: what the representamen might signify when settled into its final scope or sphere of influence. In a first valence in this dimension, the occurrent final interpretant may be a mere signifier or term, or more generally a “seme” in Peirce's Greek-derived terminology; for example, “theory” in its barest denotational sense. In a second valence, the occurrent final interpretant may be a proposition, or more generally a “pheme”, from the Greek for affirming or asserting; for example, “the theory of relativity” in its propositional force. In a third valence, the occurrent final interpretant may be an argument, or more generally, a “delome”, from the Greek for making known, showing or explaining; for example, “E=mc²” in its explanatory force.

-   VII. R-I″ sub i: the representamen in relation to the stream of activity or process of its immediate interpretant, that is, as it receives assurance from the activity of interpretation. In this dimension the interpretant may be instinctive, experiential, formal.

-   VIII. R-I″ sub d: the representamen in relation to the process of its dynamic interpretant, that is, as it appeals to or becomes part of some ongoing activity of interpretation. In this dimension, the interpretant may be suggestive, imperative/interrogative, indicative.

-   IX. R-I″ sub f: the representamen in relation to its eventual purpose, that is, in the light of what it leads to. In this dimension the mode of the interpretant may be gratific (leading to emotion), energetic (leading to action), logical (leading to understanding and, with respect to action, self control).

-   X. Finally, we may consider the sign in itself, without regard to its relation to object or interpretant. In this dimension, relevant sign elements include tone-token-type or, more generally, in Peirce's neologisms, qualisign, sinsign, or legisign. The tone is the mode of the sign's intensity. The token is the mode of its particularity. The type is the mode of its generality. It is worth noting that “tone” may be further differentiated using other sign elements of the interpretant in other dimensions (for example, sympathetic-percussive-usual from Dimension V, instinctive from Dimension VII-valence 1, suggestive from Dimension VIII-valence 1, emotional from Dimension IX-valence 1).

In summary, the ten dimensions are listed below, along with three sign elements within each dimension. The sign elements are shown below with a corresponding reference number (1, 2 or 3) that is referred to herein as the valence number.

-   Dimension I. R-O sub i: (1) descriptive, (2) denominative and (3) distributive;
-   Dimension II. R-O sub d: (1) icon, (2) index and (3) symbol;
-   Dimension III. R-O sub f: (1) abstractive, (2) concretive and (3) collective;
-   Dimension IV. R-I′ sub i: (1) hypothetical, (2) categorical and (3) relative;
-   Dimension V. R-I′ sub d: (1) sympathetic, (2) percussive and (3) usual;
-   Dimension VI. R-I′ sub f: (1) term, (2) proposition and (3) argument;
-   Dimension VII. R-I″ sub i: (1) instinctive, (2) experiential and (3) formal;
-   Dimension VIII. R-I″ sub d: (1) suggestive, (2) imperative and (3) indicative;
-   Dimension IX. R-I″ sub f: (1) emotional, (2) energetic and (3) logical;
-   Dimension X. R-R: (1) tone, (2) token and (3) type.

It will now be recognized that the invention has an expanded syntactics (the term “syntactics” is used to mean the relation among signs without regard to their meaning or their objects). Unlike the traditional definition of “syntactics”, the syntactics of the invention is not limited to relations in Dimension X, the sign considered in itself, but may also involve other sign elements in other dimensions, for example, distributive, symbolic, collective, relative, usual, delomic/argument, formal, indicative, and logical, as well as sign elements in other valences from selected dimensions (for example, icons from Dimension II may have syntactic relevance with respect to any discussion of diagrammatic reasoning).

Each sign element may be resolved into finer sub-elements. Thus, taking the “icon” sign element as an example, we may distinguish an indexical icon from a symbolic icon. Without attaching labels to the granular refinements, granularity in the “collective” sign element of the “O sub f” dimension (Dimension III) is depicted in FIG. 2 as “1.”, “2.” and “3.”. Each numerical subscript is capable of further refinement into valences depicted in FIG. 2 as “a.”, “b.” and “c.”. As suggested in FIG. 2, this process of refinement may continue (i.e. “i”, “ii”, and “iii”) and may be used to refine sign elements and sub-elements in one or more of the dimensions. It will now be appreciated that the richness of a dimension is directly proportional to the number of sign elements and sub-elements within the dimension.

A system utilizing the ten dimensional set of indicators includes a relational database for the database 20 that is searchable by the computer 10. The relational database associates a piece of media with one or more describers selected from a list of possible describers. For example, if the describers include the ten dimensions, and each dimension has three sign elements, 30 different describers will be needed, one describer for each possible dimension/sign element pair. For example, a piece of media may include a word or group of words that are identified for association with one or more describers. The identified words in the piece of media are then associated with a 32-bit vector. The bits corresponding to the describers to be associated with the identified words are turned on, for example, by changing the bit from a zero to a one. Then, when a search query is formulated such that the identified words appear in the search query along with signifiers corresponding to the bits turned on, extra weight will be given to the piece of media in order to identify it as particularly relevant to the query.

In the example given above, each vector reserves 30 bits, one for each possible describer, thereby leaving two bits available for other purposes. The remaining two bits may be used as wildcards to signal information about the identified words that are not provided by the 30 possible describers. For example, the two remaining bits may be used as triggers to greater refinement in a given dimension.
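One possible encoding of the 32-bit vector is sketched below. The text does not say which bit corresponds to which describer, so the layout used here (bit = 3 × (dimension − 1) + (valence − 1), with bits 30 and 31 left spare) is purely an assumption, as are the function names.

```python
NUM_DIMENSIONS = 10
VALENCES_PER_DIMENSION = 3
SPARE_MASK = 0b11 << 30  # the two bits left over for other purposes


def bit_index(dimension: int, valence: int) -> int:
    """Map a (dimension 1-10, valence 1-3) pair to a bit position 0-29 (assumed layout)."""
    if not (1 <= dimension <= NUM_DIMENSIONS and 1 <= valence <= VALENCES_PER_DIMENSION):
        raise ValueError("dimension must be 1-10 and valence must be 1-3")
    return (dimension - 1) * VALENCES_PER_DIMENSION + (valence - 1)


def encode(pairs) -> int:
    """Turn on the bit for each (dimension, valence) pair; compossible pairs simply add bits."""
    vector = 0
    for dimension, valence in pairs:
        vector |= 1 << bit_index(dimension, valence)
    return vector


def matches(describers: int, signifiers: int) -> bool:
    """True when every signifier bit in the query is also on in the describer vector."""
    return (describers & signifiers) == signifiers


# Identified words marked as "tone" (Dimension X, valence 1) and "icon" (Dimension II, valence 1).
portion = encode([(10, 1), (2, 1)])
query = encode([(10, 1)])
print(matches(portion, query))  # True
```

Because each describer has its own bit, several sign elements of the same dimension can be on at once, which is how this sketch accommodates compossibility.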

The semiotic dimensions, with their sign elements and valences, are independent of the type of media, language and content of the media. Consequently, the semiotic dimensions may be applied across heterogeneous formats and domains.

As an example of how the invention may be implemented for formatted text, standard HTML and XML tags may be used as descriptors to provide information about a text document. HTML tags for italic, bold, centering, underlining, and title, for example, all may be mapped as “tone” in Dimension X, by virtue of showing emphasis or intensity, and used to enrich standard weighting techniques of search engines. It is believed that such an application of the invention using HTML tags, augmented for example by word position, will yield a substantial improvement in properly identifying documents.

Similarly, HTML tags for picture or graph insets may be semiotically identified as an “icon” in Dimension II. HTML hypertext pointers may be semiotically identified both as “index” on Dimension II and “energetic” on Dimension IX. XML wildcards may be semiotically identified for “type” on Dimension X and “distributive” on Dimension I. In Dimension X, an input compossibly identified as tone, token (the default), and type, would clearly have greater weight if designated in a search query than the same input without such identification. When punctuation and English orthography are considered as a markup language, further automated parsing and mapping into the 10-Dimensional system is possible using orthographic elements as triggers for automated attribution of semiotic indicators. Of course, not all data in a piece of media need be associated with a descriptor, and not all semiotic dimensions need be activated. Note that although these examples are largely oriented to text, the same principles of application apply to other media and across media.
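The tag mapping described above could be automated roughly as follows, using Python's standard html.parser module. The choice of concrete tags (i, b, center, u, title, img, a) is an assumption about which HTML elements the text has in mind, and the class name and sample markup are hypothetical.

```python
from html.parser import HTMLParser

# Assumed mapping from HTML tags to (dimension, sign element) descriptors.
TAG_DESCRIBERS = {
    "i": [("X", "tone")],
    "b": [("X", "tone")],
    "center": [("X", "tone")],
    "u": [("X", "tone")],
    "title": [("X", "tone")],
    "img": [("II", "icon")],
    "a": [("II", "index"), ("IX", "energetic")],  # one tag, two compossible descriptors
}


class DescriberCollector(HTMLParser):
    """Records a descriptor for every mapped tag encountered in a document."""

    def __init__(self):
        super().__init__()
        self.found = []  # (tag, dimension, sign element) in document order

    def handle_starttag(self, tag, attrs):
        for dimension, element in TAG_DESCRIBERS.get(tag, []):
            self.found.append((tag, dimension, element))


collector = DescriberCollector()
collector.feed(
    '<title>Blue Boy</title>'
    '<p>A <b>portrait</b> <img src="bb.png"> <a href="x.html">more</a></p>'
)
collector.close()
for entry in collector.found:
    print(entry)
```

Tags without an entry in the mapping, such as the paragraph tag in the sample, are simply ignored, consistent with the statement that not all data need be associated with a descriptor.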

The invention not only stratifies information about media, but may be used along with more sophisticated and economical learning and data mining algorithms, logical operations, pattern recognitions, and other manipulations to enable more information to be gleaned from a search. For example, by taking into consideration how language is usually combined to express an idea, general rules may be discerned. For example, such rules may be founded on how words are logically combined or grammatically combined, or such rules may focus on the probability that a certain idea is associated with a portion of a particular media based on which describers are “on”.

The invention may be implemented in the emerging discipline of ontology construction, which facilitates inference by providing a type hierarchy characterized by inheritance relations from upper level to lower level terms. These inheritance relations are variable, ranging from the broader and narrower terms of a Thesaurus to multiple inheritance, lattice-like relations in artifacts implementing first order logic and constructed as partially ordered sets. The most sophisticated ontologies are typically limited to whole-part and class-member relations. The invention provides a means by which an ontology may be enriched so as to differentiate the hierarchies using, for example, the dimensions III, VI, and IX to distinguish varieties of implication and entailment and the objects to which they apply. Using these three dimensions, a nine-cell matrix for all ontologies may be used:

D-III          D-VI           D-IX
abstractive    term           emotional
concretive     proposition    energetic
collective     argument       logical

Leaving aside the “emotional” interpretant (the element of the first valence in Dimension-IX), which would not normally be used, and treating Dimension-VI as the relational connector between the final object in Dimension-III and the final interpretant in Dimension-IX, we already have 2×3×3 combinations across the three dimensions. We can then form various rules to provide further inferences by looking at inclusion relations and compossibilities within Dimension-III between collectives, on the one hand, and abstractives and concretives, on the other. One way of doing this may involve using sub-elements of one or more of the sign elements in Dimension VI.
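The 2×3×3 count can be checked by enumerating the combinations directly. The short sketch below assumes nothing beyond the sign elements listed for the three dimensions; the variable names are illustrative.

```python
from itertools import product

DIMENSION_III = ["abstractive", "concretive", "collective"]  # final object
DIMENSION_VI = ["term", "proposition", "argument"]           # relational connector
DIMENSION_IX = ["energetic", "logical"]                      # "emotional" is left aside

# Every (object, connector, interpretant) triple across the three dimensions.
combinations = list(product(DIMENSION_III, DIMENSION_VI, DIMENSION_IX))
print(len(combinations))  # 18
print(combinations[0])    # ('abstractive', 'term', 'energetic')
```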

The invention simplifies knowledge discovery and inference by enabling like-upon-like operations in a complex and heterogeneous database. For example, if the final continuant interpretant in Dimension IX is “energetic” (leading to action), the nature of the implication relation will differ from a “logical” interpretant. For example, “making an omelet” entails “breaking an egg” in a way different from that in which the food “egg” includes “omelet” as a narrower term, or in which “egg” implies “chicken”, or vice versa. The invention permits segregation and stratification of differentiated relations, and allows for manipulation of these relations economically. Thus, “making an omelet” and “breaking eggs” would be parsed with indicators “on” in Dimension IX-valence 2 (energetic), Dimension III-valence 2 (concretive), and Dimension VI-valence 3 (argument). By contrast, “omelet” as an “egg” dish would be found in those same dimensions under Dimension IX-valence 3 (logical), Dimension III-valence 2 (concretive) and Dimension VI-valence 1 (term). Without such an ability to differentiate ontologies there will be anomalies of non-transitivity that plague artifacts that mix different types of signs, and try to perform logical operations upon them. The invention systematically divides the global information space applicable both to individual signs and to relations among them, with the relations also being susceptible to analysis as sign interpretants.
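The omelet example can be written down as two sets of indicators. The sketch below treats a “like-upon-like” operation as one permitted only between relations that carry identical indicators; that reading, and the function name, are assumptions made for illustration.

```python
# Indicators as (dimension, valence) pairs, taken from the example in the text.
making_an_omelet = frozenset({("IX", 2), ("III", 2), ("VI", 3)})    # energetic, concretive, argument
omelet_as_egg_dish = frozenset({("IX", 3), ("III", 2), ("VI", 1)})  # logical, concretive, term


def like_upon_like(a: frozenset, b: frozenset) -> bool:
    """Assumed rule: two relations may be combined only if their indicators are identical."""
    return a == b


shared = making_an_omelet & omelet_as_egg_dish
print(sorted(shared))                                        # [('III', 2)]
print(like_upon_like(making_an_omelet, omelet_as_egg_dish))  # False
```

The two relations share only the concretive object, so a system applying this rule would keep the action entailment and the term hierarchy apart rather than chaining them.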

It is worth noting that the 10-Dimensional system described herein is not itself an ontology or type hierarchy. Although the sign elements are types in the meaning of Dimension X, and in their instantiation as bits they are also tokens, their compossible character means that there is no necessary hierarchical relation among sign elements of the same dimension. However, there may be theorems to limit the patterns in which they can co-occur, as described below in more detail.

The invention may be used to facilitate agent protocols. A computational agent is an autonomous, collaborative, intelligent, protocol-defined data system capable of acting both upon automatically perceived signals and on direct signals from a user. The agent is interactive with the relevant fields of data, with other agents and with its user. The protocol system of the agent may include parsing and mapping engines, machine learning algorithms, and search, retrieval, inference, pattern recognition and knowledge discovery capabilities. The applications specific to agents may focus on the full range of interpretant dimensions of the invention (Dimension IV through Dimension IX). Recall that the “interpretant” is defined as the effect of the sign on a mind or quasi-mind. As the agent is a quasi-mind, the agent's interpretants are the signals to which it responds and adapts. For example, for the three dimensions VII through IX relating the representamen to the continuant interpretant (R-I″) in its immediate, dynamic and final forms, the following matrix can be used:

VII             VIII            IX
instinctive     suggestive      emotional
experiential    imperative      energetic
formal          indicative      logical

Dimension VII elements, when embodied in agent communication protocols, capture and provide modes of agent learning in response to a signal. An outside signal whose immediate interpretant is “instinctive” will be accepted or rejected in accordance with fixed agent behaviors. An “experiential” interpretant will be integrated into case-based learning. A “formal” interpretant will be integrated into rule-based learning. The degree to which these dimensions are divided, subdivided and further subdivided permits a learning continuum to reflect the adaptive capabilities of the agent to modify its knowledge and behavior in response to its environment, to other agents and user input.
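
The three Dimension VII learning modes described above may be sketched, by way of example only, as a dispatch on the immediate interpretant carried by a signal. The class names `ImmediateInterpretant` and `Agent`, the method `receive`, and the use of plain lists as case and rule stores are hypothetical and chosen for brevity; an actual agent protocol could differ substantially.

```python
from enum import Enum


class ImmediateInterpretant(Enum):
    """Dimension VII elements, numbered by valence."""
    INSTINCTIVE = 1
    EXPERIENTIAL = 2
    FORMAL = 3


class Agent:
    """Toy agent whose learning mode follows the signal's Dimension VII element."""

    def __init__(self, fixed_accept):
        self.fixed_accept = set(fixed_accept)  # fixed, unlearned behaviors
        self.cases = []                        # stand-in for case-based learning
        self.rules = []                        # stand-in for rule-based learning

    def receive(self, payload, interpretant):
        if interpretant is ImmediateInterpretant.INSTINCTIVE:
            # Accepted or rejected according to fixed behaviors; nothing is learned.
            return "accepted" if payload in self.fixed_accept else "rejected"
        if interpretant is ImmediateInterpretant.EXPERIENTIAL:
            self.cases.append(payload)
            return "case stored"
        # FORMAL: integrate the signal as a rule.
        self.rules.append(payload)
        return "rule stored"


agent = Agent(fixed_accept={"ping"})
print(agent.receive("ping", ImmediateInterpretant.INSTINCTIVE))
print(agent.receive("order 17 shipped late", ImmediateInterpretant.EXPERIENTIAL))
print(agent.receive("if late then notify user", ImmediateInterpretant.FORMAL))
```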

Dimension VIII elements, when embodied in agent communication protocols, capture and provide for modes of agent acceptance of a signal. A “suggestive” signal appeals to possible behaviors without specificity. An “imperative/interrogative” signal provides an order to the agent, which may also be in the form of a query. An “indicative” signal provides information to be taken into account. Again, finer divisions, subdivisions and further subdivisions permit these modes to flow into a continuum.

Dimension IX elements, when embodied in agent communication protocols, capture and provide for modes of agent orientation, leading to feeling in the case of the “emotional” interpretant (for a computational agent, a “feeling” may be translated as a value outside the normal economic scale of values), action for the “energetic” interpretant, and understanding for the “logical” interpretant.

Another example of how the invention may be implemented is as a semiotic theorem tester. Since Peirce's sign element theory became publicly available in the 1950s via incomplete and sometimes inconsistent versions, there has been increasing work to develop and refine it, to provide formal proofs of the underlying mathematics, and to relate it to emerging work in computational fields. What has been lacking is an empirical testbed designed for the purpose. The 10-Dimensional system of the invention provides such a testbed. It opens a new field of algebra that formally specifies the relations and permissible computational operations upon sign elements. A specific focus of such an algebra will be to test theories about the role of sign valences in attracting or repelling certain combinations of sign elements, both theoretically and empirically.

At the theoretical level, there has been some work to suggest that the actual number of stable dimension-sign combinations (in principle, 3 to the tenth power, or 59,049 combinations) may be limited to under 100, with 66 being the most often cited number. How to prove conjectures of this kind is itself at issue, since the sign elements are pre-mathematical. Once the relationship between a sign element and its computational embodiment is established, the corresponding algebra will serve as a laboratory for the study of signs and semiotic combinatorics more generally.
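
A theorem tester of this kind may be sketched as an exhaustive enumeration followed by a candidate stability rule. In the Python sketch below, the rule in `is_stable` (the valence may never increase from one dimension to the next) and the ordering of the ten dimensions are assumptions adopted purely for illustration; this description does not specify which constraint yields the cited figure. Under that assumed rule, the number of surviving combinations happens to be 66.

```python
from itertools import product

DIMENSIONS = 10
VALENCES = (1, 2, 3)


def is_stable(combo):
    """Hypothetical rule: valences never increase along the dimension order."""
    return all(a >= b for a, b in zip(combo, combo[1:]))


# All assignments of one valence to each of the ten dimensions.
all_combos = list(product(VALENCES, repeat=DIMENSIONS))
stable = [combo for combo in all_combos if is_stable(combo)]

print(len(all_combos))  # 59049, i.e. 3 ** 10
print(len(stable))      # 66 under the assumed rule
```

Replacing `is_stable` with a different predicate allows competing conjectures about attracting and repelling valences to be compared on the same enumeration.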

At the empirical level, data enriched by some or all of the dimensions can be used as a training bed to produce algorithms to improve precision and recall of retrieved information. Such data can also be mined for rules and probabilistic relations that can, in turn, be managed with a modularized addition. For example, an algebra can produce a formally defined and empirically testable ontology calculus, which will distinguish the several kinds of inheritance that characterize different reference ontologies, stipulate rules for storing them efficiently, avoid confusions among them, and permit transitive inference from upper level artifacts to domain ontologies of compatible sign elements. At the global theoretical level, such an algebra becomes a testbed for theorems about the ten-dimensional system itself. It has, for example, been argued that certain dimensions may be subordinated to others to reduce the number of active elements to be considered, or alternatively, that combinations containing repelling valences are unstable or impossible. The invention permits experiments to test these hypotheses.
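
The precision and recall measures referred to at the start of the preceding paragraph may be computed as in the following Python sketch. The function name `precision_recall` and the document identifiers are hypothetical and serve only to show the arithmetic; they do not represent results obtained with the invention.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up identifiers: four documents retrieved, two documents truly relevant.
relevant_docs = ["d1", "d5"]
unfiltered = precision_recall(["d1", "d2", "d3", "d4"], relevant_docs)
print(unfiltered)  # (0.25, 0.5)

# A second, equally made-up result list, e.g. after restricting by describers.
restricted = precision_recall(["d1", "d5", "d3"], relevant_docs)
print(restricted)  # precision 2/3, recall 1.0
```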

Although the present invention has been described with respect to particular embodiments, it will be understood that other embodiments of the present invention may be made without departing from the spirit and scope of the present invention. Hence, the present invention is deemed limited only by the appended claims and the reasonable interpretation thereof.

1-83. (canceled)
84. A computer readable medium including at least one entry stored thereon, wherein at least one portion of the at least one entry includes entry signs and is associated with at least one describer that identifies at least one semiotic property among at least some of the entry signs included in the at least one portion, wherein the at least one semiotic property is a member of a group of semiotic properties, in which the group includes between two and ten different dimensions of semiotic relations, in which each dimension is divided into at least three types of sign elements, each of which can be further subdivided, and in which each semiotic property is independent of the thematic content denoted by signs.
85. The computer readable medium of claim 84, wherein the dimension is a relation between a representamen and an immediate object.
86. The computer readable medium of claim 85, wherein the dimension includes at least one of a descriptive, a denominative, and a distributive element.
87. The computer readable medium of claim 84, wherein the dimension is a relation between a representamen and a dynamic object.
88. The computer readable medium of claim 87, wherein the dimension includes at least one of an iconic, an indexic, and a symbolic element.
89. The computer readable medium of claim 84, wherein the dimension is a relation between a representamen and a mode of being of an object.
90. The computer readable medium of claim 89, wherein the dimension includes at least one of an abstractive, a concretive, and a collective element.
91. The computer readable medium of claim 84, wherein the dimension is a relation between a representamen and an immediate occurent interpretant.
92. The computer readable medium of claim 91, wherein the dimension includes at least one of a hypothetical, a categorical, and a relative element.
93. The computer readable medium of claim 84, wherein the dimension is a relation between a representamen and a dynamic occurent interpretant.
94. The computer readable medium of claim 93, wherein the dimension includes at least one of a sympathetic, a percussive, and a usual element.
95. The computer readable medium of claim 84, wherein the dimension is a relation between a representamen and a final occurent interpretant.
96. The computer readable medium of claim 95, wherein the dimension includes at least one of a term, a proposition, and an argument element.
97. The computer readable medium of claim 84, wherein the dimension is a relation between a representamen and an immediate continuant interpretant.
98. The computer readable medium of claim 97, wherein the dimension includes at least one of an instinctive, an experiential, and a formal element.
99. The computer readable medium of claim 84, wherein the dimension is a relation between a representamen and a dynamic continuant interpretant.
100. The computer readable medium of claim 99, wherein the dimension includes at least one of a suggestive, an imperative, and an indicative element.
101. The computer readable medium of claim 84, wherein the dimension is a relation between a representamen and a final continuant interpretant.
102. The computer readable medium of claim 101, wherein the dimension includes at least one of an emotional, an energetic, and a logical element.
103. The computer readable medium of claim 84, wherein the dimension is a relation of a representamen to itself, separated from its relations to objects and interpretants.
104. The computer readable medium of claim 103, wherein the dimension includes at least one of a tone, a token, and a type element.